Abstract


Universal Turing machines are a well-known concept 
in computer science. Most often, universal Turing 
machines are constructed by humans, and designed 
with particular features in mind. This is especially 
true of recent efforts to find small universal Turing 
machines, as these are often the product of sophisti- 
cated human reasoning. In this paper we take a dif- 
ferent approach, in that we investigate how we can 
search through a number of Turing machines and 
recognise universal ones. This means that we have 
to examine very carefully the concepts involved, in- 
cluding the notion of what it means for a Turing ma- 
chine to be universal, and what implications there are 
for the way that Turing machines are coded as input 
strings. 


1 Introduction 


Universal Turing machines (Turing 1936) are a well- 
known concept in computer science, and are most 
commonly used as a mathematical model of compu- 
tation. Often, especially in textbooks, the intuitive, 
informal idea of a universal machine is all that is ad- 
dressed (and for that matter, all that is necessary), in 
order to be able to prove undecidability results. With 
the benefit of 70-odd years of hindsight (and around 
60 years of general purpose computing), the idea it- 
self does not seem particularly difficult to grasp, and 
is often presented as an intuitively natural step from 
previous work. 

Since Turing’s original work, and particularly in
recent times, there has been ongoing interest in finding
small universal Turing machines (Minsky 1962,
Watanabe 1972, Rogozhin 1996, Neary and Woods
2007). One particularly well-known effort is that of
Minsky (Minsky 1962), who showed the existence of
a 7-state 4-symbol machine that was universal. How- 
ever, as Minsky himself pointed out, it is not at all 
obvious how one would ‘program’ this universal ma- 
chine, nor indeed whether it is possible to identify 
this machine as universal just by inspecting it. Since 
then, a number of existence and non-existence results 
have been shown. These include the results that there 
are no universal Turing machines with 1 state, nor 
with 2 states and 2 symbols (Minsky 1962), nor 3
states with 2 symbols (Pavlotskaya 1978). There are 
also some results on more sophisticated measures of 
simplicity than merely the number of states (Calude


Copyright ©2011, Australian Computer Society, Inc. This pa- 
per appeared at the 17th Computing: The Australasian The- 
ory Symposium (CATS 2011), Perth, Australia, January 2011. 
Conferences in Research and Practice in Information Technol- 
ogy (CRPIT), Vol. 119, Alex Potanin and Taso Viglas, Ed. 
Reproduction for academic, not-for profit purposes permitted 
provided this text is included. 


2008). Woods and Neary (Woods and Neary 2009) 
have given an excellent survey of the known results. 

The notion of universal computation has become 
widespread within the culture of computer science¹,
and the idea is often only discussed informally. This is 
usually done in the context of introducing the concept 
of universality in order to prove some undecidability 
results, such as the undecidability of the halting prob- 
lem for Turing machines. However, the formal notion 
of universality is more difficult to find in the litera- 
ture, and perhaps surprisingly, there is more than one 
precise notion. Essentially this difference is whether a 
universal machine has to precisely simulate the entire 
computation of the original machine (Fischer 1965), 
or whether it is sufficient to simulate the calculation 
of the output of the original machine from its input 
(Herman 1968). In the latter case, the universal ma- 
chine does not need to simulate every step of the orig- 
inal computation, but can skip some steps, provided 
that the final result is the same. This difference can
be thought of as a reflection of the differences be- 
tween intuitive notions of computation, i.e. whether 
a computation computes an output, or carries out a 
process. 

A different perspective on the problem of find- 
ing universal machines was given by Wolfram (Wol- 
fram 2002), who performed an automated search 
for universal Turing machines (and for universality 
amongst other computational models such as cellular 
automata). This involved generating a large number 
of Turing machines and applying some criterion for
evaluating whether or not each one was universal. In
Wolfram’s case, this criterion was whether the output
of the machine was ‘inherently complex’,
i.e. it was not a straightforward, regular pattern, but 
contained some seemingly random elements. 

Whilst this approach is commendable, and is sim- 
ilar in spirit to automated searches to find values for 
the busy beaver function (Rado 1963, Lin and Rado 
1964, Brady 1983, Marxen and Buntrock 1990), the 
criterion used essentially relies on human judgement, 
and was not specifically defined, which makes the re- 
sults difficult to analyse (and reproduce). However, 
it does appear to be the first systematic attempt to 
search for universal machines, rather than to con- 
struct them. 

A further issue raised by Wolfram’s search is that 
his universal machines necessarily do not terminate; 
in order to simulate a terminating computation, the 
final configuration of the terminating computation is 
repeated indefinitely in the simulation. This adds an 
extra dimension to the discussion of universality, in 
that it needs to be established in advance whether 
this is a reasonable property, or whether it is appro- 
priate to insist that a terminating computation can 
only be simulated by a terminating computation on 


¹ Dare one say that the concept of universal machines has be-

the universal machine.

In this paper, we discuss how to perform a system- 
atic search for universal Turing machines, but with 
more explicit criteria. This means we need to exam- 
ine the notion of universal machine rather closely. In 
particular, we need to be precise about exactly what 
it means to be universal, and what precise criteria 
can be applied in an automated search. However, as 
we shall see, this is quite a large undertaking; not 
only are there various issues in the definition of uni- 
versality to be dealt with, including the definitions of 
appropriate encoding functions, there are also a num- 
ber of difficulties in defining a universality criterion 
which is sufficiently precise to be able to completely 
determine the universality of a given machine. In par- 
ticular, it seems likely that the best that we can hope 
for is a pseudo-universality test, similar in spirit to 
pseudo-primality tests, which can be used to defini- 
tively determine that a given number is not prime, 
but can only establish primality with a level of uncer- 
tainty. 

The main contribution of this paper is to set up a 
framework in which to investigate universality proper- 
ties of classes of machines. This paper is organised as 
follows. In Section 2, we provide a detailed discussion 
of the issues involved in the definition of universality, 
and in Section 3 we give our explicit choices of Turing 
machine and related formal issues. In Section 4, we 
discuss how to generate the classes of candidate ma- 
chines, and in Section 5 we discuss issues relating to 
the encoding of machines. In Section 6 we provide a 
summary of this discussion in terms of a list of issues 
to be resolved, and in Section 7 we discuss various 
outstanding issues. Finally in Section 8 we present 
our conclusions and areas of further work. 


2 Discovering Universal Machines 


2.1 Definitions of Universality 


Alan Turing introduced the formal model of compu- 
tation which is now known as Turing machines (Tur- 
ing 1936), and showed that it is possible to construct 
a universal Turing machine. This machine takes an- 
other Turing machine as input and is able to simulate 
the computation of the original machine with a suit- 
able input to the universal machine. Whilst the main 
purpose for Turing’s introduction of the universal ma- 
chine was to show the halting problem for Turing ma- 
chines is undecidable, there has been interest since 
then in finding small universal Turing machines. This 
has generally been done by construction, i.e. by find- 
ing particular machines which are increasingly small. 
As mentioned above, there are various results known 
about small universal Turing machines, and there 
is ongoing work aimed at ‘closing the gap’ between 
the largest known classes of machines in which there 
are no universal machines, and the smallest known 
universal Turing machines (Woods and Neary 2009). 
Whilst the search for small universal Turing
machines is a fascinating and ongoing research topic, 
it is not the focus of this paper. Here we are con- 
cerned with how we can discover universal machines 
within a given set of machines. 

In order to perform a search along the lines of 
Wolfram’s, there are a number of issues that need 
to be addressed. A first observation is that this 
search will be similar to searches undertaken to solve 
the busy beaver problem (Rado 1963, Lin and Rado 
1964, Brady 1983, Marxen and Buntrock 1990, Har- 
land 2006, 2007). The busy beaver problem is to find 
the largest number of 1’s that can be printed by a ter- 
minating Turing machine with no more than n states 
on the blank input (often referred to as the produc- 
tivity of the machine). Hence the search for a busy 
beaver (and related concepts such as the placid platypus
(Harland 2006, Batfai 2009)) involves generating
all machines in the class, evaluating whether or not
each machine terminates on the blank input², and
finding the maximum productivity of the terminating
machines. It is well-known that the busy beaver
function is not computable (Rado 1963); however, as
there are only a finite³ number of machines of a given
size, it is possible to compute the value of the busy 
beaver function for any given input value. 
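
The search procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the cited searches: the names run, all_machines and best_productivity and the step bound max_steps are introduced here, only two tape symbols are used, and a machine that exceeds the bound is treated as unresolved rather than as proven non-terminating.

```python
from itertools import product

HALT = "h"

def run(delta, max_steps):
    """Run a machine on the blank tape. Return the number of 1s on the tape
    if it halts within max_steps, otherwise None (unresolved)."""
    tape, pos, state = {}, 0, "a"
    for _ in range(max_steps):
        if state == HALT:
            return sum(tape.values())
        action = delta.get((state, tape.get(pos, 0)))
        if action is None:
            # delta is undefined here; treated as halting (an assumption)
            return sum(tape.values())
        output, direction, state = action
        tape[pos] = output
        pos += 1 if direction == "r" else -1
    return sum(tape.values()) if state == HALT else None

def all_machines(n):
    """Enumerate every exhaustive n-state machine over the symbols {0, 1}."""
    states = [chr(ord("a") + i) for i in range(n)]
    keys = [(s, i) for s in states for i in (0, 1)]
    actions = [(o, d, ns) for o in (0, 1) for d in "lr" for ns in states + [HALT]]
    for choice in product(actions, repeat=len(keys)):
        yield dict(zip(keys, choice))

def best_productivity(n, max_steps=100):
    """Maximum productivity among machines that halt within the step bound."""
    results = (run(m, max_steps) for m in all_machines(n))
    return max((r for r in results if r is not None), default=None)
```

Brute-force enumeration of this kind is feasible only for very small n; the cited searches rely on normalisation and non-termination analysis that this sketch omits.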

To apply a similar process to finding universal ma- 
chines, we need to be able to determine whether a 
given machine is universal or not. This is a different 
perspective from the way that universal machines are 
usually presented in textbooks, as universal machines 
are normally produced by construction, together with 
an appropriate encoding function. However, in or- 
der to search for universal machines amongst series 
of Turing machines, it is necessary to develop a par- 
ticular criterion for evaluating the universality or oth- 
erwise of a given machine. 

One issue that arises in the contemplation of such 
a criterion is that not only is it comparatively rare to 
find a formal definition of a universal Turing machine, 
but that there are at least two such definitions. One 
may be termed result-oriented, in that given a Tur- 
ing machine M with an input w, a universal machine 
U need only “agree” with the result of the computation
of M on w (Neary and Woods 2007, Priese 1979,
Herman 1968, Davis 1956, 1957). In other words, a 
machine U is universal if for any machine M and in- 
put w to M, there is an input w’ to U such that the 
result of the computation of U on w’ is an encoding 
of the output of M on w. In particular, it is not nec- 
essary for each step of the computation of U on w’ 
to ‘mimic’ every step of the computation of M on w. 
Note that w’ is anything but arbitrary; typically w’ is 
constructed in a very precise manner from M and w. 
The other definition, which may be termed process- 
oriented, insists that each step of the computation of 
M on w is in fact mimicked by the computation of 
U on w’, so that for each configuration in the com- 
putation of M on w, there is a corresponding unique 
configuration in the computation of U on w’ (Fischer
1965). It is not hard to see that a machine which is 
universal in the process-oriented sense must also be 
universal in the result-oriented sense, as in the former 
case, we certainly require that the final configuration 
in the computation of M on w corresponds to the 
final configuration of the computation of U on w’.

The variance between these definitions is perhaps 
not all that surprising, in that in order to define a 
universal machine, it is necessary to give a precise 
definition of computation, and at least in an intu- 
itive sense, computation can be thought of as either 
a process which takes a given input and computes a 
specific output, or as a process which follows a par- 
ticular series of steps, and which may not necessarily 
terminate (and in fact not terminate by design, rather 
than as a consequence of the existence of undecidable 
problems). This same variance of opinion is at the 
heart of the two different definitions of universality 
(although it must be said that the result-oriented ap- 
proach seems to have achieved a recent consensus). 
This view is perhaps best summarised in the quota- 
tion below. 


“.. the purpose of the simulation is to reproduce the 
input-output relation of the simulated machine.” 


— Gabor Herman, (Herman 1968), page 813. 


² Note that as there are only a finite number of machines involved,
this is a decidable problem; see the discussion on the finite
decision anomaly in §7.

³ but hyperfactorial, i.e. O(n^n)
However, if we think of computation as a process,
which includes the possibility of non-termination, 
then it seems more natural to adopt the process- 
oriented definition. In particular, the result-oriented 
definition treats all non-terminating computations as 
the same (and uninteresting), which may not seem 
intuitive for software systems such as web servers, or 
operating systems, or other software which does not 
fit neatly into what one may call an algorithmic view 
of computation. In addition, the result-oriented ap- 
proach has the effect of ignoring differences between 
computations which produce the same output. For 
example, if we only consider the input-output be- 
haviour of programs, then all sorting algorithms are 
effectively considered to be the same. To be fair, 
the result-oriented view isn’t always this extreme, in 
that it is not unreasonable to allow a universal ma- 
chine to ‘skip’ a few steps in the original computation. 
However, it seems difficult to define precisely what is 
meant by such harmless skipping without also allow- 
ing the entire computation of M on w to be simulated 
by a single step of U on w’. Of course it is unlikely 
that such a universal machine exists (i.e. one that sim- 
ulates the computation of M on input w with a single 
step); the point here is that before we can begin to 
search for such machines, we need to consider care- 
fully what exactly we are looking for. Also, one lesson 
that can be learned from the search for busy beavers 
is that human construction of machines is generally 
no match for those that can be found by systematic 
searches, particularly when there are restrictions on 
the size of the machines that may be used. Hence it 
would seem best to avoid as much as possible relying 
on assumptions that certain properties are ‘unlikely’. 
To exaggerate to make the point, the productivities of 
the 6-state 2-symbol busy beaver candidates seemed 
‘unlikely’ until they were found.⁴


2.2 Termination 


Another issue that needs to be addressed is whether 
or not a universal machine must mimic the termina- 
tion behaviour of the original machine. More par- 
ticularly, must the simulation of a terminating com- 
putation terminate? (and, for that matter, must the 
simulation of a non-terminating computation also not 
terminate?). As mentioned above, it is not possible 
for Wolfram’s machines to terminate, as there is no 
halt transition. This raises the issue of whether a 
universal machine must be of the same ‘type’ as the 
machines it simulates. There is an intuitive sense
in which a universal machine must be able to simu- 
late its own computations (as otherwise it is in some 
sense not universal, as there is at least one machine 
whose computations it doesn’t simulate). However, 
this can also be viewed as asking over what class of 
machines the universal machine is expected to oper- 
ate. In terms of simulating all computations for all 
machines in a given class, it is not obvious that the 
only class of interest is all possible machines. For ex- 
ample, it may be of interest to consider only machines 
of a certain computational complexity, or which per- 
form a specific set of algorithms (such as arithmetic 
operations, or sorting algorithms). It is certainly nat- 
ural to require that a universal machine be able to 
simulate all possible machines, including itself, but 
again this may be making assumptions which, while 
they appear natural, may not be justified. Another 
way to put this is that universality, in its intuitive 
sense, is one extreme; it may be that a given ma- 
chine will simulate some computations and not oth- 
ers, and so a universal machine may be considered as 
one kind of meta-machine (i.e. a machine which takes 


⁴ The best known such productivity is currently 1018:2768 (11).


another machine as input). It is intriguing to consider 
other possibilities for interesting meta-machines, but 
in this paper we will consider only universal machines 
(whichever precise definition is used). We will refer to 
universal machines whose termination behaviour pre- 
cisely matches that of every machine which is input to 
it as faithful; in other words, a faithful universal ma- 
chine must terminate when the machine it simulates 
terminates, and not terminate when the machine it is 
simulating does not terminate. 


2.3 Encoding Machines 


Another issue that needs to be considered is how the 
machine to be simulated is encoded as an input to 
the universal machine. As noted by Herman (Her- 
man 1968) (see below), it is important that the cod- 
ing function serve only to translate the machine into 
a form that can be read by another machine. 


“This is to stop the encoding and decoding of 
algorithm to carry out the real computational work” 


— Gabor Herman, (Herman 1968), page 813. 


Similar concerns are discussed by Colmerauer 
(Colmerauer 2004). This occurs in his description 
of his development of a very sophisticated universal 
Turing machine, which is notable not only for be- 
ing explicitly defined and well-documented, but also 
in the amount of effort put into its design, mak- 
ing it arguably the most sophisticated universal Tur- 
ing machine known. Colmerauer’s concerns are not 
about making universal Turing machines small, but 
about providing an appropriate metric for measuring 
the overhead involved in simulating one machine by 
another. Specifically, Colmerauer aims to construct 
a universal machine which minimises this particular 
metric. The details of this machine are beyond the 
scope of this paper, but much of Colmerauer’s anal- 
ysis is relevant, and so we will discuss its important 
aspects. 

Colmerauer also discusses the encoding function in 
some detail, and is in particular concerned about cod- 
ing functions that may ‘cheat’, such as by recognising 
particular machines, and encoding these differently 
from other machines. It seems natural that one way 
to avoid any such problems is to place some particu- 
lar requirements on coding functions. In particular, 
it seems reasonable to require that the function it- 
self must be defined homomorphically, i.e. in terms of 
its effects on the states, symbols and other elements 
of the machine, and let these definitions determine 
the overall encoding of the machine. This makes it 
impossible for the encoding to recognise particular machines,
or to perform ‘the real computational work’,
as all it is doing is to translate the nuts-and-bolts of 
one machine into the input language of another. 
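
One reading of the homomorphic requirement is that code() is fixed entirely by its action on individual states, symbols and directions, with the encoding of a whole machine being the concatenation of the encodings of its transitions. The sketch below illustrates this reading only; the unary component codes, the separators and all of the function names are hypothetical choices made here, and are not taken from Herman or Colmerauer.

```python
def code_state(s):
    # hypothetical unary code: h -> "1", a -> "11", b -> "111", ...
    index = 0 if s == "h" else ord(s) - ord("a") + 1
    return "1" * (index + 1)

def code_symbol(x):
    # tape symbols 0, 1, 2, ... -> "1", "11", "111", ...
    return "1" * (x + 1)

def code_direction(d):
    return {"l": "1", "r": "11"}[d]

def code_transition(t):
    """Encode one quintuple (State, Input, Output, Direction, NewState)
    purely from the codes of its five components."""
    s, i, o, d, ns = t
    parts = [code_state(s), code_symbol(i), code_symbol(o),
             code_direction(d), code_state(ns)]
    return "0".join(parts)

def code_machine(transitions):
    """Encode a machine as the concatenation of its transition codes.
    Nothing here inspects the machine as a whole, so the encoder cannot
    single out particular machines for special treatment."""
    return "00".join(code_transition(t) for t in sorted(transitions)) + "000"
```

Because every component code is a non-empty block of 1s, the separators 0, 00 and 000 can be told apart, so the encoding can be decoded symbol by symbol.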

Another important observation that Colmerauer 
makes is that the encoding itself is bound up in the 
definition of universality. In particular, the design 
process for building a universal machine usually com- 
mences with defining an appropriate encoding, which 
is then used throughout the universal machine. It is 
difficult to imagine a machine that has been defined 
with one encoding in mind being sufficiently robust 
to also be universal for a different encoding, no mat- 
ter how similar. Put another way, we can hardly ex- 
pect a machine to retain the seemingly brittle and 
scarce property of universality if we were to change 
the encoding on which it depends, but not the ma- 
chine itself. For these reasons Colmerauer refers to 
a universal machine and its encoding function as a 
universal pair. 
It is also notable that Colmerauer defines what we
may call a particular architecture for the way in which 
the universal machine works. In a nutshell, given a 
machine M and an input w, which is assumed to be 
using the same symbols as the universal machine U, 
the input to U which is used to simulate the compu- 
tation of M on w is code(M)w where code() is the en- 
coding function. The important point to note is that 
we have made a particular assumption about the way 
in which the universal machine will read and organise 
its input (and in particular that the initial state of U 
will be pointing to the first symbol to the left of the 
string code(M)w). When universal machines are con- 
structed by hand, this is entirely natural; when we are 
searching for universal machines amongst a number of 
machines, it is not so obvious that it is appropriate 
to be so specific about how the machine is organised. 
For example, what if there is a machine which ex- 
pects its input in the form w code(M)? This may 
seem to be a rather vacuous objection, in that clearly 
the universal machine needs to be able to access the 
information about M and w somehow. However, we 
again need to be careful about making unwarranted 
assumptions. 

This may indicate that an appropriate response is 
to not specify exactly what the input to the universal 
machine should be. In this case, a machine U would 
be universal if for each machine M and input w there 
is a w’ such that M on w is simulated by the com- 
putation of U on w’. However, this may allow some 
trivial machines to be universal, as if w’ is just the 
output of M on w (assuming M terminates on w), 
and U is the machine which halts immediately, then 
clearly the output of U on w’ is the same as the out- 
put of M on w. Note also that this is only an issue for 
the result-oriented approach; in the process-oriented 
approach, even if w’ is an encoding of the result of M 
on w, we are still required to perform a computation 
of U on w’ for which each state of the computation 
of M on w is represented by a unique configuration 
in the computation of U on w’. 

Hence it seems reasonable to require that the input 
w’ to U be an encoding of M and w. The main issue 
arises when the test for universality fails (which we 
expect will be most of the time). In particular, it is 
not obvious whether it is reasonable to try other per- 
mutations of the input or not (such as w code(M)). 
It should also be noted that it is possible to have 
variations on universality such as weak universality 
(Woods and Neary 2009), in which an infinite num- 
ber of copies of a particular string are kept on the 
tape. Whilst we do not pursue this and similar op- 
tions here, it is worth noting that such variations can 
be thought of as one aspect of what we have termed 
the architecture of the universal machine. 


2.4 Pseudo-universality 


Perhaps the most interesting aspect of Colmerauer’s 
architecture is the following reasoning. Colmerauer 
defines a universal machine U as one that for any 
M and w, U on code(M)w simulates M on w for 
some appropriate function code(). Now as M can be 
any Turing machine, M could in fact be U, and so if
M = U, we must have that for any w, U on code(U)w
simulates U on w, and so U on code(U)code(U)w
simulates U on code(U)w which simulates U on w.
By extending this argument, we have that U on
code^k(U)w simulates U on w for all k ≥ 1, where
code^k(U) denotes k consecutive copies of code(U).
Colmerauer then uses this property to define his 
metric. In our case, the interesting aspect of this 
observation is that we can use this property together 
with an analysis of busy beaver candidates to arrive 
at a “pseudo-universality” test. As U on code^k(U)w
simulates U on w for all k ≥ 1, then when w is the
blank tape (written here as ⊔), we have that U on
code^k(U)⊔ simulates U on ⊔ for all k ≥ 1. In other
words, for a universal machine U, U on code^k(U)⊔
simulates the computation of U on the blank input, 
which is exactly what is tested for in the busy beaver 
analysis. Hence we arrive at the following definition. 


Definition 1 A deterministic Turing machine M
is pseudo-universal with respect to code() if M on
code^k(M)⊔ simulates M on ⊔ for all k ≥ 1, where ⊔
is the blank tape.


Clearly any universal machine will be pseudo- 
universal, but not necessarily vice-versa. This is in- 
tentionally similar to pseudo-primality tests, which 
are based on necessary (but not sufficient) properties 
of primes. In this case we need only find some j for
which M on code^j(M)⊔ does not simulate M on ⊔ to
show that M is not pseudo-universal. As we expect
universal machines to be rare, and any machine universal
with respect to code() will be pseudo-universal,
we expect a relatively small number of machines to pass
the pseudo-universality test. In this way, Colmer- 
auer’s property leads to one notion of a criterion for 
testing for universality (or more specifically, for non- 
universality). It seems natural to measure the qual- 
ity of the pseudo-universality test by measuring the 
number of machines that are pseudo-universal but not 
universal (and clearly the fewer there are, the better 
the test is). 
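
As with pseudo-primality, the test is most naturally run as a search for a refuting witness. The following Python sketch shows that shape under several assumptions: the predicate simulates is a hypothetical stand-in for whichever notion of simulation (result-oriented or process-oriented) is adopted and would itself need to be bounded in practice, the blank tape is represented by the empty string, and the bound max_k is introduced here because only finitely many values of k can ever be tried.

```python
def refuting_witness(machine, code, simulates, max_k):
    """Return the smallest j <= max_k for which the machine, run on j copies
    of its own code followed by the blank tape, fails to simulate its own
    computation on the blank tape. Return None if no such j is found.

    code(machine) -> str               hypothetical encoding function
    simulates(machine, tape, w) -> bool
        hypothetical check that the machine on `tape` simulates
        the machine on input `w`
    """
    blank = ""
    prefix = ""
    for j in range(1, max_k + 1):
        prefix += code(machine)          # prefix is now code^j(machine)
        if not simulates(machine, prefix + blank, blank):
            return j                     # definitely not pseudo-universal
    return None                          # not refuted up to max_k
```

A result of None means only that the machine has survived the test for k up to max_k; it does not establish pseudo-universality, let alone universality.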


3 Formal definitions 


Before examining some of the issues discussed above 
in more detail, we give some formal definitions, in- 
cluding a precise specification of the Turing machines 
that we will be using. 

We use the following definition of a Turing ma- 
chine (Sudkamp 2005). 


Definition 2 (Sudkamp) A Turing machine is a
quadruple (Q ∪ {h}, Γ, δ, q₀) where

• Q is a finite set of states (and h ∉ Q)
• h is a distinguished state called a halting state
• Γ is the (finite) tape alphabet
• δ is a partial function from Q × Γ to (Q ∪ {h}) ×
  Γ × {l, r} called the transition function
• q₀ ∈ Q is a distinguished state called the start
  state

In this paper, Q is {a,b,c,d,...} unless otherwise
specified, where a is the initial state (i.e. q₀ is labelled
a).
Note that this is the so-called quintuple transi- 
tion variation of Turing machines, in that a transition 
must specify for a given input state and input charac- 
ter, a new state, an output character and a direction 
for the tape in which to move. Hence a transition can 
be specified by a quintuple of the form 


(State, Input, Output, Direction, NewState) 


For this reason we will often refer to a transition
in a given machine as t(S, I, O, D, NS).
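
The quintuple form maps directly onto a small data structure. The Python sketch below is one possible representation and is not part of the formal definition: the names Transition, make_delta and step are introduced here, the tape is modelled as a dictionary from positions to symbols with 0 as the blank, and the direction is interpreted as movement of the head position.

```python
from typing import NamedTuple

class Transition(NamedTuple):
    state: str
    input: int
    output: int
    direction: str   # "l" or "r"
    new_state: str

def make_delta(transitions):
    """Build the partial function delta from a list of quintuples, rejecting
    two transitions for the same (state, input) pair."""
    delta = {}
    for t in transitions:
        key = (t.state, t.input)
        if key in delta:
            raise ValueError(f"duplicate transition for {key}")
        delta[key] = t
    return delta

def step(delta, state, tape, pos):
    """Apply one transition in place. Return the new (state, pos), or None
    if delta is undefined for the current state and symbol."""
    t = delta.get((state, tape.get(pos, 0)))
    if t is None:
        return None
    tape[pos] = t.output
    return t.new_state, pos + (1 if t.direction == "r" else -1)
```

Storing delta as a dictionary keyed on (state, input) makes the partial-function requirement explicit: a missing key is an undefined transition.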

Some varieties of Turing machines only allow a 
transition to write a new character on the tape or to 
move, and not both; for such machines, clearly only 
a tuple of 4 elements is required. For our purposes, 
the important thing to note is that given some nota- 
tional convention for identifying the start state and 
halting state (here we denote the start state as a and
the halting state as h), a Turing machine can be characterised
by the tuples which make up the definition
of δ.

Note also that there are no transitions for state h, 
and that as δ is a partial function, there is at most
one transition for a given pair of a state and a symbol 
in the tape alphabet. 


Definition 3 Let (S, I, O, D, NS) be a transition in
a Turing machine M. We say this is a halting transition
if NS = h.

We call a Turing machine M

• normal if there is exactly one halting transition
  in M.

• exhaustive if δ is a total function from Q × Γ
  to (Q ∪ {h}) × Γ × {l, r}, i.e. ∀q ∈ Q, ∀γ ∈ Γ,
  ∃q′ ∈ Q ∪ {h}, γ′ ∈ Γ and D ∈ {l, r} such that
  δ(q, γ) = (q′, γ′, D).


Note also that machines which are exhaustive but 
not normal are either guaranteed not to terminate (as 
there is no transition into the halting state and every 
combination of state and input symbol has a transi- 
tion defined for it), or have multiple halting transi- 
tions, of which at most only one can ever be used, 
making the other halting transitions spurious. In our 
terminology, Wolfram’s machines are exhaustive, but 
not normal. We in general will be interested in nor- 
mal machines, but not necessarily exhaustive ones. 
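
Both properties are simple to check mechanically. The following Python sketch assumes that a machine is given as a list of quintuples (S, I, O, D, NS) with "h" as the halting state; the function names are introduced here for illustration.

```python
HALT = "h"

def is_normal(transitions):
    """True if exactly one transition leads into the halting state."""
    return sum(1 for (_, _, _, _, ns) in transitions if ns == HALT) == 1

def is_exhaustive(transitions, states, alphabet):
    """True if a transition is defined for every (state, symbol) pair."""
    defined = {(s, i) for (s, i, _, _, _) in transitions}
    return defined == {(q, g) for q in states for g in alphabet}
```

For example, a machine with no transition into h can be exhaustive but is never normal, which is the situation described above for Wolfram's machines.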

We denote by an n-state Turing machine one in 
which |Q| = n. In other words, an n-state Turing 
machine has n “real” states and a halt state. 

As we will often be discussing universality, which
necessarily involves taking one Turing machine as the
input machine and another as the (potentially) universal
machine, we will use the term input machine
to refer to a machine whose computation is to be simulated
and candidate machine to refer to the potentially
universal machine. In other words, when discussing
whether the computation of M₁ on w is simulated
by M₂ on code(M₁)w (or whatever input is
appropriate), we refer to M₁ as the input machine
and M₂ as the candidate machine.

Following the busy beaver linguistic precedent⁵,
we will call universal machines universal unicorns. 


4 Searching Classes of Machines 


When generating classes of machines to be tested, it 
is tempting to directly re-use the classes already gen- 
erated for the busy beaver searches. However, these 
classes are understandably optimised for searching for 
busy beaver machines. As the number of machines of 
a given size is hyperfactorial (i.e. O(n^n)), it is clearly
important to minimise the number of machines that 
need to be searched. 
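
To give a sense of the growth involved, the short Python sketch below counts the exhaustive machines with n states and m symbols before any normalisation is applied. The counting convention is an assumption made here: each of the n × m pairs of state and symbol independently chooses an output symbol, a direction and a new state (including h), giving (2m(n + 1))^(n × m) machines in total.

```python
def count_exhaustive(n, m=2):
    """Number of exhaustive machines with n states and m tape symbols:
    each of the n*m (state, symbol) pairs picks one of m outputs,
    2 directions and n+1 new states (the n states plus the halt state)."""
    return (2 * m * (n + 1)) ** (n * m)

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_exhaustive(n))
```

With two symbols the count already exceeds sixteen million at three states, which is why reductions such as fixing the first transition matter so much in practice.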

One way in which this is done for busy beaver ma- 
chines is to make strong use of the fact that the initial 
input is blank. Given that the tape is symmetric, the 
direction moved by the initial transition may be arbitrarily
chosen, and is usually chosen as r.⁶
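
The symmetry argument can be made concrete by constructing the mirror image of a machine, in which every direction is reversed. In the Python sketch below, machines are assumed to be lists of quintuples (S, I, O, D, NS), and the function name is introduced here for illustration.

```python
def sinister_sibling(transitions):
    """Return the machine obtained by reversing the direction of every
    transition, leaving states and symbols unchanged."""
    flip = {"l": "r", "r": "l"}
    return [(s, i, o, flip[d], ns) for (s, i, o, d, ns) in transitions]
```

Applying the function twice returns the original machine, so a search need only generate one machine from each such pair.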

Let us assume that there are only two tape sym- 
bols, including blank, which we denote as {0, 1} where 
0 is the blank (this convention is standard in the busy 


⁵ Also known as the law of alliterative adjectives.

⁶ Hence any machine in this class has a sinister sibling machine
in which the direction moved by each transition, including the first,
is reversed. The original machine and its sinister sibling will clearly
behave identically apart from the direction of motion of the tape
head.


beaver literature). If we denote the initial transi- 
tion as t(a,0,O,r, NS) (note that I must be 0 for 
the first transition used, as the input is blank), then 
if NS = a, then the machine will loop infinitely to 
the right, no matter what the value of O is, as the 
computation is started on a blank tape. Hence it is 
sensible to insist that NS = b,⁷ and so we have the
first transition as t(a, 0, O, r, b). Now either O = 0 or
O = 1. If O = 0, then note that the first step of the
machine does not change the tape at all. Hence we 
could replace this machine with the one that results 
from swapping the state a with the first state S for 
which we have t(S, 0, 1, D, NS), i.e. the first state for
which I = 0 and O = 1. If there is no such state, then
the machine cannot print any 1’s, and hence is not of 
interest. The new machine is then one that prints a 
1 as a result of the first transition, and which other- 
wise has the same productivity as the original. This 
means that for busy beaver machines, we can always 
assume the first transition is t(a, 0, 1, r, b). 

Another consideration is that busy beaver ma- 
chines are assumed to be normal and exhaustive. Be- 
ing normal ensures that there is a way for the machine 
to terminate, which is obviously vital for busy beaver 
machines. Being exhaustive is a property that helps 
with maximising productivity. A terminating ma- 
chine that is not exhaustive is in some sense wasteful. 
For example, consider a machine which contains the 
transition t(c,0,1,r,h) but has no transitions for state 
b with input 1 and state c with input 1 (which may 
be considered as having three halting transitions). By 
halting when the machine first encounters 0 in state 
c, the machine is in some sense “missing out” when 
compared to a machine which defines a new transition 
for 0 in state c, and then continues, in the knowledge 
that there are still at least two possibilities for the 
halt transition (i.e. 1 in state c and 1 in state b). 
In other words, if there are fewer than n × m transi- 
tions, say k, then there is another machine with the 
same k transitions and with another added which will 
have higher productivity. Hence, whilst there is an 
explicit halt transition in these machines, the genera- 
tion of the machines is carefully constrained to ensure 
that each machine contains exactly n × m transitions. 
Effectively this means that there are ‘only’ n × m − 2 
transitions that need to be generated, as the first one 
is fixed, and when there is only one transition left 
to be specified, not only must this be the halt tran- 
sition, it can always be assumed to be of the form 
t(S_1, I_1, 1, r, h), where h is the halt state (i.e. there are 
no transitions from state h). The direction is again 
arbitrary (as the machine halts after this transition, 
the position of the tape pointer is irrelevant), and as 
it is desired to maximise the number of non-blank 
symbols printed by the machine, it is always sensible 
for the output to be 1. 

A third issue is the technique that is generally 
used to generate machines known as tree normal form 
(TNF) (Lin and Rado 1964). This may be thought 
of as exploiting the property that machines are nor- 
mal and exhaustive, in that machines are generated 
by computing with partially defined machines and 
adding transitions when a combination of state and 
input is encountered which is not yet defined in the 
machine. Initially the machine consists of just the 
transition t(a,0,1,r,b). Executing the partial ma- 
chine on a blank tape gives us the configuration 1{b}0 
(where we use the convention that the tape pointer is 


7 Note that when generating such machines, we use the method- 
ology that the first state that is not a is b, the first state that is 
neither a nor b is c, and so forth. Otherwise, we run into unneces- 
sary combinatorial explosions. 

8 Whilst this informal argument seems reasonable, it should be 
said that to the best of the author’s knowledge, this argument has 
not been formalised. 

pointing to the cell immediately to the right of {b}). 
As we have no transition defined for the state b with 
input 0, we choose values for O, D and NS, and add 
the transition t(b,0,O,D,NS) to the machine. We 
then update the configuration according to the newly 
generated transition and continue. This process con- 
tinues until we have only two possible transitions left. 
Once we allocate one of these, we then know that the 
final remaining combination of state and input (say 
S_h and I_h) must be for the halting transition, and 
so we complete the machine by adding the transition 
t(S_h, I_h, 1, r, h). 
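
The basic step of the TNF process described above, running a partially defined machine until it reaches an undefined combination of state and input, can be sketched as follows. This is a minimal illustration rather than the implementation used in any busy beaver search: the function name run_until_undefined, the dictionary representation of transitions and the step budget max_steps are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

# (state, input symbol) -> (output symbol, direction, next state)
Transitions = Dict[Tuple[str, int], Tuple[int, str, str]]


def run_until_undefined(
    trans: Transitions, max_steps: int = 1000
) -> Optional[Tuple[str, int, Dict[int, int], int]]:
    """Run a partial machine on a blank tape, starting in state 'a'.

    Returns (state, symbol, tape, head) at the first undefined
    (state, symbol) pair, or None if the machine reaches the halt
    state 'h' or exhausts the (hypothetical) step budget first.
    """
    tape: Dict[int, int] = {}  # sparse tape; absent cells hold the blank 0
    head, state = 0, "a"
    for _ in range(max_steps):
        if state == "h":
            return None
        symbol = tape.get(head, 0)
        if (state, symbol) not in trans:
            return state, symbol, tape, head
        out, direction, state = trans[(state, symbol)]
        tape[head] = out
        head += 1 if direction == "r" else -1
    return None


# The initial partial machine consists of just t(a,0,1,r,b).
partial = {("a", 0): (1, "r", "b")}
print(run_until_undefined(partial))  # the pair (b, 0) is the next to define
```

A TNF generator would call such a function repeatedly, branching over every choice of O, D and NS for the returned pair.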

When searching for universal unicorns, the as- 
sumptions on which these simplifications are based 
do not apply. Firstly, the input is not always going to 
be blank, and so the first transition will not always be 
t(a,0,1,r,b). Secondly, it is not obvious that a uni- 
versal machine will be exhaustive, i.e. contain n × m 
transitions,⁹ and so we will need to allow the gener- 
ation of machines which have between 2 and n × m 
transitions in them. Also, the halt transition can- 
not be assumed to be of the form t(S_1, I_1, 1, r, h), as 
the final state of the machine may require the tape 
pointer to move either left or right or to output an 
arbitrary symbol in order to terminate with an ap- 
propriately encoded state of the machine. Finally the 
TNF method, which works very well for busy beaver 
machines, cannot be used here, both because we can- 
not assume that the input is blank (and hence we 
do not know which transition will be used first) and 
also because we do not assume that the machine must 
be exhaustive. The latter objection could be over- 
come by allowing the halting transitions to be cho- 
sen ‘freely’, rather than always being the last to be 
chosen. However, the former objection is more funda- 
mental, as the TNF technique is very much predicated 
on the knowledge of a single fixed input, which will 
clearly not apply to a universal machine. 

Hence whilst the results of the search for busy 
beavers will provide some useful data against which 
to compare results, the busy beaver search space is smaller, and so 
we will need to perform a larger search for universal 
unicorns. In particular, we will need to ‘freely’ gen- 
erate machines, rather than in the sequential manner 
implied by the TNF process. For example, the TNF 
process for 4 states and 2 symbols generates 527,590 
machines to be tested. In the universal case, we will 
require 8 transitions, each of which will have 2 (for 
O) × 2 (for D) × 5 (for NS) = 20 possibilities. Note 
that there are 5 possibilities for NS, as it is possi- 
ble for any given transition to be the halting transi- 
tion. This means that there are 20^8 = 2^8 × 10^8 = 
25,600,000,000 machines to be searched. It may be 
possible to reduce this number by some clever analy- 
sis (such as not allowing the first transition used to be 
the halting transition), but in any case it is clear that 
this will involve a substantially larger search than the 
busy beaver case. 
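
The count above generalises directly to n states and m symbols, assuming (as in the text) that every one of the n × m transitions is generated freely and that any transition may lead to the halt state. The function name below is hypothetical and chosen only for this sketch.

```python
def unconstrained_machine_count(n_states: int, n_symbols: int) -> int:
    """Number of freely generated machines with n_states * n_symbols transitions.

    Each transition chooses an output symbol (n_symbols options), a
    direction (2 options) and a next state (n_states + 1 options,
    since the halt state is also allowed).
    """
    per_transition = n_symbols * 2 * (n_states + 1)
    return per_transition ** (n_states * n_symbols)


# 4 states and 2 symbols: 20 choices for each of 8 transitions.
print(f"{unconstrained_machine_count(4, 2):,}")  # 25,600,000,000
```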


5 Encoding Machines 


A key issue, as we have seen, is the precise properties 
of the encoding function. In this section we discuss 
various issues related to this function. 


5.1 Coding Functions 


A natural place to start looking for such functions 
is to see what functions have been used in the lit- 
erature. In particular, there are a large number of 
textbooks which discuss universal Turing machines, 


9 Although this is probably likely, especially for small machines. 
and whilst most such treatments give only an infor- 
mal description of a universal Turing machine, most 
such accounts explicitly define a coding function. 

A good example is one given by Sudkamp (Sud- 
kamp 2005) (3rd edition, pp. 327-328). Sudkamp in- 
formally describes a 3-tape universal Turing machine, 
in which Tape 3 contains the simulated tape of M and 
hence has w on it at the beginning of computation, 
Tape 2 has the current state and Tape 1 has the en- 
coded version of M. This machine is described rather 
than explicitly given. However, our main interest here 
is the coding function, which is as follows (both the 
original machine and universal machine use the input 
alphabet {0,1} and the tape alphabet {B,0,1}). We 
will define the encoding in terms of three functions 
inputs(), states() and ops(), as below. 


inputs(0) = 1        states(q0) = 1 
inputs(1) = 11       states(q1) = 11 
inputs(B) = 111      ... 
                     states(qn) = 1^(n+1) 
ops(L)    = 1        ops(R)     = 11 


A transition t(S,I,O,D, NS) is encoded as the 
concatenation of the encoding of each of its elements, 
with a 0 inserted between each element. Hence a tran- 
sition is encoded as below. 


states(S) 0 inputs(I) 0 states(NS) 0 inputs(O) 0 ops(D) 


0 is used to separate components of a transition, 
00 is used to separate transitions, and 000 is used to 
indicate the start and end of the encoded machine. 

Hence if M = [t1, ..., tn], then we have (writing 
tr(ti) for the encoding above): 


code(M) = 000 tr(t1) 00 tr(t2) 00 ... 00 tr(tn) 000 
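
The coding function just described can be sketched in a few lines. This is an illustrative reading of the definitions above, not code taken from Sudkamp: states are represented by their integer index (q0 as 0, q1 as 1, and so on), a transition t(S,I,O,D,NS) by a tuple in that order, and the Python names are chosen to mirror the text.

```python
from typing import List, Tuple

# Transition t(S, I, O, D, NS) with states given by their index i in q_i.
Transition = Tuple[int, str, str, str, int]

INPUTS = {"0": "1", "1": "11", "B": "111"}
OPS = {"L": "1", "R": "11"}


def states(i: int) -> str:
    """Encode state q_i as i + 1 ones."""
    return "1" * (i + 1)


def tr(t: Transition) -> str:
    """Encode one transition, with a single 0 between its elements."""
    s, i, o, d, ns = t
    return "0".join([states(s), INPUTS[i], states(ns), INPUTS[o], OPS[d]])


def code(machine: List[Transition]) -> str:
    """Encode a machine: 00 between transitions, 000 at start and end."""
    return "000" + "00".join(tr(t) for t in machine) + "000"


example = [(0, "0", "1", "R", 1), (1, "B", "B", "L", 0)]
print(tr(example[0]))  # 101011011011
print(code(example))
```

Because no element is ever encoded as the empty string, the separators 0, 00 and 000 can always be told apart from the runs of 1's, which is what makes the code decodable.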


Note that this may be classified as a unary code, 
in that a variable number of 1’s is used to specify 
which of the different states, inputs and directions is 
meant, and the 0 is used only as a separator. It is 
interesting to note that many textbooks have either 
a similar code (i.e. some variation of a unary code), 
or a code which involves a large number of symbols, 
which is designed to simplify the construction of the 
universal machine. Given that we will be testing a 
given machine for universality, we are not in a position 
to choose the number of symbols that may be used, 
and so we will have to allow for a number of possible 
codes. 


5.2  Unary Codes 


With this example code in mind, it is not hard to 
specify some general conditions for unary codes to 
ensure some minimal ‘sensible’ properties. 

Let S be the set of states used in all machines in 
ℳ (so that for any M ∈ ℳ, the states of M ⊆ S). 
Let names(S) be an ordered sequence of names for 
elements of S; for convenience, we will assume that 
names(S) is the sequence s1, s2, s3, .... If S is finite 
then clearly so is names(S). Hence for any s ∈ S, 
there is an i such that name(s) = si. We denote by 
n(s) the sequence s1, s2, ..., si. Let Ops be the set 
of all operations used in all machines in ℳ. Usually 
Ops = {l,r}. 

A coding function is then defined in terms of 
three components: a mapping on S, a mapping on 
A and a mapping on Ops, denoted as states, in- 
puts and ops respectively. We also assume that the 
coding function is generated homomorphically from 
its definition on individual transitions. This elim- 
inates some of Colmerauer’s ‘cheat’ cases, in that 
there can be no definitions like “if the machine is 
U then the empty string else ...”. This means that 
if M has the transitions t1, ..., tn, then code(M) = 
code(t1)code(t2)...code(tn). 

To avoid ‘lazy’ codes, which for example encode 
the input symbol 1 as an entire machine, we insist on 
the following properties: 


• For all s ∈ S, |states(s)| ≤ |n(s)| + 1 
• For all i ∈ A, |inputs(i)| ≤ |A| + 1 
• For all o ∈ Ops, |ops(o)| ≤ |Ops| + 1 


This means that codes can be at worst unary, 
and generally better than that. For example, if our 
“target” universal machine has tape alphabet Σ = 
{x,y}, then a maximally “worst” permissible coding 
would be one that maps the states {s1, s2, ..., sn} to 
{yx, yyx, yyyx, ..., y^n x}. 
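
The three length conditions can be checked mechanically. The sketch below reads the bounds as non-strict, which is what the worst-case example requires, and takes the codes of the states in the order s1, s2, ...; the function name is_permissible and the argument layout are assumptions made for illustration only.

```python
from typing import Dict, List


def is_permissible(
    state_codes: List[str],
    input_codes: Dict[str, str],
    op_codes: Dict[str, str],
) -> bool:
    """Check the three length bounds that rule out 'lazy' codes.

    state_codes[i - 1] is the code of the i-th state s_i, so |n(s_i)| = i.
    """
    states_ok = all(len(c) <= i + 1 for i, c in enumerate(state_codes, start=1))
    inputs_ok = all(len(c) <= len(input_codes) + 1 for c in input_codes.values())
    ops_ok = all(len(c) <= len(op_codes) + 1 for c in op_codes.values())
    return states_ok and inputs_ok and ops_ok


# The maximally 'worst' permissible state coding: yx, yyx, yyyx, yyyyx.
worst = ["y" * i + "x" for i in range(1, 5)]
print(is_permissible(worst, {"0": "yx", "1": "yyx"}, {"l": "yx", "r": "yyx"}))

# A code that spends four symbols on the first state is rejected.
print(is_permissible(["yyyx"], {"0": "yx"}, {"l": "yx"}))
```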

Given such a unary code, it is not hard to see 
that we do not need to consider the particular way 
in which the symbols are used. If the language has 
at least three symbols (including the blank symbol), 
which we will denote as {B,0,1}, then it does not 
matter whether we choose 0 as the ‘separator’ and 1 
as the ‘counter’, as above, or 1 as the separator and 
0 as the counter. In particular, if we denote the 
former alternative as code(), then for any machine M 
which is universal with respect to code(), there 
is a machine M~ which is universal with respect to 
code~(), where code~() is the same as code(), except 
that the roles of 0 and 1 are interchanged, and M~ is 
M under the following transformation: 


For each transition t(S,I,O,D,NS) in M, we have 
a transition t(S, swap(I), swap(O), D, NS) in M~, 
where swap(0) = 1 and swap(1) = 0. 
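
The transformation is simple enough to state as code. In this sketch a transition t(S,I,O,D,NS) is a tuple in that order; leaving the blank B (and any other symbol) unchanged is an assumption, as the text only defines swap on 0 and 1, and the function names other than swap are hypothetical.

```python
from typing import List, Tuple

Transition = Tuple[str, str, str, str, str]  # t(S, I, O, D, NS)


def swap(symbol: str) -> str:
    """Interchange 0 and 1; any other symbol (e.g. B) is left unchanged."""
    return {"0": "1", "1": "0"}.get(symbol, symbol)


def swap_machine(machine: List[Transition]) -> List[Transition]:
    """Swap the input and output symbols of every transition of a machine."""
    return [(s, swap(i), swap(o), d, ns) for (s, i, o, d, ns) in machine]


def swap_code(encoded: str) -> str:
    """Interchange the roles of 0 and 1 in an encoded string."""
    return "".join(swap(ch) for ch in encoded)


m = [("a", "0", "1", "r", "b"), ("b", "B", "0", "l", "a")]
print(swap_machine(m))
print(swap_code("000101100"))  # 111010011
```

Applying either function twice returns the original, which is why universality for one version of the code carries over to the other.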


When the language contains only two symbols (in- 
cluding a blank), then the same transformation can 
be performed. In fact, if we denote the language (per- 
haps confusingly!) as {B,1}, then it is arguably sim- 
pler to use 1 as the separator and B as the counter, as 
then the representation of a machine on an otherwise 
blank tape will be 


111B1B1BB1BB1B11BB...111 


Whichever version is preferred, we can identify a 
‘canonical’ unary code in this way, knowing that if 
there is a universal machine for one version of the 
code, then there will be a universal machine for the 
other version. There is some potential for variance 
amongst the way that transitions are separated and 
whether the 000 at the start and end of the machine 
is strictly necessary. However, it is clear that one 
does not need more symbols used in this way than 
in Sudkamp’s code, and arguably fewer. In addition, 
we clearly require at least one ‘separator’ between el- 
ements of an encoded transition and at least one at 
the beginning and end of the machine. This means 
that any savings on such separators will be at most 
one symbol per transition plus 4 overall (as the 000 
and 000 at the start and end of the machine could be 
just 0 and 0). As there are three different kinds of 
separator needed (between elements of a transition, 
between transitions, and at the start and end of a 
machine), there are four variations of this code that 
seem reasonable to try: 


1. All separators are distinct (as above) 
2. Two are the same and one is not 
3. All separators are the same 


Hence if we denote the separator between elements 
of a transition as Sep1, between transitions as Sep2, 
and at the start and end of a machine as Sep3, we 
have the following four cases: 


1. Sep1 = 0, Sep2 = 00, Sep3 = 000 (as in Sudkamp) 
2. Sep1 = Sep2 = 0, Sep3 = 00 
3. Sep1 = 0, Sep2 = Sep3 = 00 
4. Sep1 = Sep2 = Sep3 = 0 


It is possible that there could be other codes, such 
as using Sep1 = 000000, but we are assuming that 
this would not be used in preference to one of the 
shorter codes above. This means that we could rea- 
sonably limit the search for a universal Turing ma- 
chine to one of the above four cases, and to the case 
when 0 is the separator symbol and 1 is the counter 
symbol. 
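
The four separator variants can be captured by parameterising the encoder on the three separators. The sketch below assumes each transition is already given as a list of its five unary-coded elements; the names SEPARATOR_VARIANTS and encode are hypothetical. Note that when separators coincide, decoding relies on knowing that every transition has exactly five elements.

```python
from typing import Dict, List, Tuple

# Variant number -> (Sep1, Sep2, Sep3): between the elements of a
# transition, between transitions, and at the start and end of a machine.
SEPARATOR_VARIANTS: Dict[int, Tuple[str, str, str]] = {
    1: ("0", "00", "000"),  # all distinct, as in Sudkamp's code
    2: ("0", "0", "00"),
    3: ("0", "00", "00"),
    4: ("0", "0", "0"),     # all the same
}


def encode(machine: List[List[str]], variant: int) -> str:
    """Join unary-coded transition elements using the chosen separators."""
    sep1, sep2, sep3 = SEPARATOR_VARIANTS[variant]
    return sep3 + sep2.join(sep1.join(fields) for fields in machine) + sep3


machine = [["1", "1", "11", "11", "11"], ["11", "111", "1", "111", "1"]]
for v in sorted(SEPARATOR_VARIANTS):
    print(v, encode(machine, v))
```

For this two-transition machine the shortest variant saves five symbols over the longest: one at the single boundary between transitions and four at the two ends.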


5.3 Binary Codes 


A natural generalisation of unary codes is to use bi- 
nary ones. To represent an n-state m-symbol machine 
in an alphabet containing k symbols, we require the 
following: 


Element         S   I   O   D   NS 
Possible cases  n   m   m   2   n+1 

Note that n + 1 possibilities are needed for NS rather 
than n as we need to allow for the halting state. This 
means that each transition will require 

⌈log_k n⌉ + ⌈log_k m⌉ + ⌈log_k m⌉ + ⌈log_k 2⌉ + ⌈log_k (n+1)⌉ 

symbols to encode it, and so each machine requires 
up to 

nm(⌈log_k n⌉ + ⌈log_k (n+1)⌉ + 2⌈log_k m⌉ + 1) 

symbols (note that ⌈log_k 2⌉ = 1 as k ≥ 2). We are 
assuming that there are no separators necessary here; 
either we add them as above, or we can assume that 
these are not necessary if n and m are known in ad- 
vance. Knowing n and m in advance means that we 
can determine the exact number of symbols used to 
represent each of the elements of a transition, the 
length of each encoded transition and the maximum 
number of transitions, and hence no separators are 
needed. 

Hence given k symbols, in principle one can look 
at codes of dimension 1,2,...k. It is tempting to 
say that a code of dimension k − 1 is unlikely to be 
used if one of dimension k is possible; put a little 
more starkly, why would anyone use a unary code if 
a binary one could be used? Whilst this is a reason- 
able attitude for humans creating universal machines, 
when it comes to searching a given class for univer- 
sality, it seems highly appropriate to allow for such 
‘improbable’ codes. One lesson that can be drawn 
from the search for busy beavers is that humans are 
not very good at constructing machines with complex 
behaviours and a limited number of states, and so we 
should be very careful about what assumptions are 
made (and be very precise in stating exactly what 
they are). 

Now if our candidate machine M is used as an 
input machine, then we can assume that k = m. This 
means that each machine will require at most 


nm(⌈log_m n⌉ + ⌈log_m (n+1)⌉ + 3) 

symbols, as log_m m = 1. Now if we assume that n < 
m, and so n + 1 ≤ m, this becomes nm(1 + 1 + 3) = 
5nm. 
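
The length bound can be evaluated exactly with integer arithmetic, which avoids rounding problems with floating-point logarithms. This is a small sketch of the formula above; the helper names are hypothetical.

```python
def ceil_log(k: int, x: int) -> int:
    """Smallest d with k**d >= x, i.e. the ceiling of log_k(x) for x >= 1."""
    if k < 2 or x < 1:
        raise ValueError("requires k >= 2 and x >= 1")
    d, power = 0, 1
    while power < x:
        power *= k
        d += 1
    return d


def symbols_per_transition(n: int, m: int, k: int) -> int:
    """Symbols needed for S, I, O, D and NS in a k-symbol alphabet."""
    return (ceil_log(k, n) + 2 * ceil_log(k, m)
            + ceil_log(k, 2) + ceil_log(k, n + 1))


def symbols_per_machine(n: int, m: int, k: int) -> int:
    """Upper bound for an n-state m-symbol machine with n * m transitions."""
    return n * m * symbols_per_transition(n, m, k)


# With k = m: more symbols than states gives 5nm, the reverse gives more.
print(symbols_per_machine(2, 4, 4))  # 40 = 5 * 2 * 4
print(symbols_per_machine(4, 2, 2))  # 64 = 8 * 4 * 2
```

The two calls show the asymmetry discussed next: a 2-state 4-symbol machine needs 5 symbols per transition, whereas a 4-state 2-symbol machine needs 8.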

This shows that machines which have more sym- 
bols than states can be encoded more compactly than 
machines with more states than symbols. This in it- 
self is not a particularly deep observation; however, 
this may help to explain a rather mysterious prop- 
erty of the busy beaver machines, which is that the 
n-state 2-symbol machines seem to be a less complex 
class than the 2-state n-symbol machines. Consider 
the table below. 


States  Symbols  Productivity     Steps 
2       2        4                6 
3       2        6                21 
2       3        9                38 
4       2        13               107 
2       4        2,050            3,932,964 
3       3        374,676,383      119,112,334,170,342,540 
5       2        ≥ 4098           ≥ 47,176,870 
2       5        > 1.7 × 10^352   > 1.9 × 10^704 


This greater productivity may reflect the greater 
compactness of representation possible. More par- 
ticularly for the search for universal machines, this 
suggests that we are more likely to find universal ma- 
chines in the 2-state n-symbol class than the n-state 
2-symbol class. 


6 Process 


So in order to perform an automated search for uni- 
versality, we need to do the following: 


1. Determine the appropriate definition of input 
machines. This will include whether or not the 
machine is deterministic, how many tapes and 
tape heads it has, whether the tape (or tapes) is 
infinite in one direction only or both directions, 
and whether or not an explicit halting transition 
is required. 


2. Determine the appropriate definition of candi- 
date machines. Generally this would be the same 
class of machines as the input machines, but this 
method will allow for possibilities such as Wol- 
fram’s universal machines, which always do not 
terminate, and terminating machines are simu- 
lated by infinitely repeating the encoding of the 
final configuration of the terminating machine. 


3. Determine whether or not universal machines 
are required to be faithful. This may often be 
implicit from the class of machines chosen in 2 
above, but it seems appropriate to insist that an 
explicit statement be made. 


4. Determine the precise notion of universality; in 
particular, it should be stated whether this no- 
tion is result-oriented or process-oriented. 


5. Determine the architecture. This involves choos- 
ing the general strategy of how the input ma- 
chine will be presented to the candidate machine, 
such as how many copies of the input machine are 
used, and where the copy or copies will be placed 
on the tape. We will assume that only one copy 
is used unless explicitly stated otherwise. 


6. Determine the encoding. If there are k symbols in 
the language of the candidate machine, it seems 
that it will be necessary to try at least one i-ary 
code for each 1 ≤ i ≤ k. 


7. Determine an appropriate criterion to be applied 
to the class of candidate machines. This is likely 
to be a necessary but not sufficient test (like the 
pseudo-universality discussed above). 


We refer to 1-6 above as the universality context 
in which the universality criterion is evaluated. 

If we find a machine which passes the test in 7, 
then either we have found a universal machine, or 
at least a likely candidate for one, which may re- 
quire further analysis. Machines which fail the test 
can be definitively stated to be not universal for this 
universality context, and for any reasonable criterion 
and universality context, it seems reasonable to ex- 
pect that most machines will fail. However, stating 
that a machine is not universal for any uni- 
versality context will clearly take significantly more 
work. 

The above process will be used to design imple- 
mentations which generate appropriate classes of ma- 
chines and test them for universality. We have com- 
menced work on such an implementation, but it is not 
yet complete. 

It should be noted that such empirical searches 
will always have limitations; in this case, it would 
appear to be the variety of different universality con- 
texts, and in particular, the variety of possible encod- 
ing functions. However, it is hoped that even some 
partial success in this way will help shed some light 
on the issue of how small universal machines can be 
expected to be. 


7 Implementation 


We are developing a prototype implementation which 
will initially use the following universality context: 


1. normal exhaustive machines 
2. normal exhaustive machines 
3. faithful machines 
4. process-oriented universality 
5. Colmerauer’s architecture 
6. Sudkamp’s encoding 


We will use the pseudo-universality criterion of 
Definition 1 in this universality context. 

We will first apply this universality criterion in this 
universality context to the class of 2-state 2-symbol 
machines. Whilst it is known that there are no uni- 
versal machines in this class, this will give us some 
indication of how precise our criterion for pseudo- 
universality is. As we are clearly unable to directly 
test M on code^k(M) for all k > 1, we will test all 
values of k up to some maximum value, such as 10. 

The most difficult task will be to evaluate whether 
the computation of M on ⊔ is simulated by the com- 
putation of M on code^k(M). As we are requiring can- 
didate machines to be faithful, one first test we can 
apply is whether the termination behaviour of M on 
⊔ and M on code^k(M) ‘match’. In other words, if 
M on ⊔ terminates, then so should M on code^k(M); 
otherwise, M is clearly not universal. Similar com- 
ments apply if M on ⊔ does not terminate but M 
on code^k(M) does. When both terminate, we may 
also conclude that M is not universal if the length of 
computation of M on ⊔ is longer than that of M on 
code^k(M). As we are using process-oriented univer- 
sality, for any universal machine M, the computation 
of M on code^k(M) must be at least as long as that of 
M on ⊔ (and possibly longer). 
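
The termination test just described can be sketched as a filter with a step budget. This is only an illustration under several assumptions: the reference computation is taken to be M on the blank tape, 0 is the blank symbol, the head starts on the leftmost input cell in state a, and the encoded inputs are supplied by the caller. All function names are hypothetical. A machine is rejected only when the encoded run halts within the budget while the reference run either halts later or has not halted within the budget; both cases violate the conditions above. Every other case is inconclusive and the machine is kept.

```python
from typing import Dict, Iterable, Optional, Tuple

# (state, input symbol) -> (output symbol, direction, next state)
Transitions = Dict[Tuple[str, str], Tuple[str, str, str]]


def steps_to_halt(trans: Transitions, tape_input: str,
                  max_steps: int) -> Optional[int]:
    """Steps taken to halt on tape_input, or None if the budget runs out.

    Halting means reaching state 'h' or a (state, symbol) pair with no
    transition defined.
    """
    tape = dict(enumerate(tape_input))
    head, state = 0, "a"
    for steps in range(max_steps + 1):
        if state == "h":
            return steps
        key = (state, tape.get(head, "0"))  # unwritten cells are blank
        if key not in trans:
            return steps
        out, direction, state = trans[key]
        tape[head] = out
        head += 1 if direction == "r" else -1
    return None


def passes_termination_filter(trans: Transitions,
                              encoded_inputs: Iterable[str],
                              max_steps: int = 10_000) -> bool:
    """Necessary (not sufficient) check on termination behaviour."""
    base = steps_to_halt(trans, "", max_steps)
    for enc in encoded_inputs:
        sim = steps_to_halt(trans, enc, max_steps)
        # The encoded run halted: the blank-tape run must halt no later.
        if sim is not None and (base is None or sim < base):
            return False
    return True


m1 = {("a", "0"): ("1", "r", "h"), ("a", "1"): ("1", "r", "a")}
m2 = {("a", "0"): ("0", "r", "a"), ("a", "1"): ("1", "r", "h")}
print(passes_termination_filter(m1, ["111"]))  # kept
print(passes_termination_filter(m2, ["1"]))    # rejected
```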


Proceedings of the Seventeenth Computing: The Australasian Theory Symposium (CATS 2011), Perth, Australia 


Further analysis will presumably require a more 
direct comparison of the computation of M on ⊔ and 
M on code^k(M), to check that each configuration of 
M on ⊔ appears in encoded form in the computation 
of M on code^k(M). 


8 Issues 


A number of issues remain to be settled. We discuss 
a few of them here. 

Is universality decidable? 

This is not a question that appears to have been ad- 
dressed previously, to the best of the author’s knowl- 
edge. It would seem likely that the answer is no, 
i.e. that the universality of a given machine, for any 
universality context, is undecidable. Note that this 
is a more general problem than the one being ad- 
dressed above, as this problem does not restrict the 
size of the machine M. In other words, if ℳ is the 
class of all Turing machines of any size (however Tur- 
ing machines are defined), the problem is to deter- 
mine whether an algorithm exists or not to determine 
whether any given M ∈ ℳ is universal (however uni- 
versality is defined). 

Similar questions arise about the simulation of M 
on ⊔ by M on code^k(M), i.e. whether or not it is possi- 
ble to give an algorithm for any M ∈ ℳ to determine 
whether the computation of M on ⊔ is simulated by 
M on code^k(M). This problem has some minor vari- 
ations, such as whether this holds for a given value 
of k, or whether it holds for some k, or for all k. It 
is well-known that it is undecidable whether M on 
⊔ terminates or not (Sudkamp 2005). However, note 
that this problem involves a relative judgement, in 
that it is not whether the computation of M on ⊔ 
terminates or not, but whether this computation is 
simulated by M on code^k(M). This means that the 
decidability of this problem is not obviously inconsis- 
tent with the undecidability of the termination of M 
on ⊔. On the other hand, it seems intuitively likely 
that this problem is undecidable. Similar remarks ap- 
ply to the more general question of whether it is de- 
cidable that the computation of M1 on w is simulated 
by M2 on code(M1)w. 

Note that even if all of these problems are undecid- 
able, this does not render our quest hopeless. As we 
are evaluating machines of a fixed size, there are only 
a finite number of machines under consideration at 
any particular time. This means that we are address- 
ing a decidable instance of the more general problem. 
In particular, even if the general problem of testing for 
universality (i.e. over all possible Turing machines) is 
undecidable, testing for universality over a finite set 
of machines is decidable. 

It should be noted that a property that we call 
the finite decision anomaly will apply here. This 
is that any decision problem over any finite set is de- 
cidable, as there are only a finite number of possible 
decisions; in particular, for a set of n elements and a 
decision with 2 outcomes, there are 2^n possible deci- 
sions, and for each one of these there is an algorithm 
that appropriately assigns ‘yes’ or ‘no’ for each ele- 
ment of the set. For the universality test, this means 
that for any class of machines ℳ_n of size no more than 
n, there is always a decision procedure for universal- 
ity.¹⁰ Hence even if the general test for universality 
is undecidable, it is still possible to determine 
universality up to any given size of machine. This 
would be entirely analogous to the busy beaver prob- 
lem, for which the general problem is undecidable, 
but for any given k, there is an algorithm to compute 


10 Although, somewhat paradoxically, knowing this gives us ab- 
solutely no information as to what the decision procedure may be. 


the corresponding busy beaver value. Hence although 
there is no general algorithm, there is an algorithm 
for any given input value. This may be thought of as 
a concept of infinity which is particularly apt for the- 
oretical computer science, in that for any finite value 
k we can compute the busy beaver value, but there is 
no single algorithm which will compute this value for 
an arbitrary k. 

What proportion of machines are universal? 
If we denote by ℳ_n the set of all Turing machines of 
size ≤ n, for a given universality context, we may define 
the universal proportion as the ratio 


|U_n| / |ℳ_n| 


where U_n is the set of all universal machines in ℳ_n. 
This leads us to ask a number of questions. For a 
given set of machines ℳ_n, is there at least one univer- 
sality context for which the universal proportion is non- 
zero? Is there more than one such universality context? 
What is the maximum universal proportion within a 
given universality context but across all applicable en- 
coding functions? What is the maximum universal 
proportion across all universality contexts? 

Criteria for universality 
We have seen how Colmerauer’s property can be used 
to derive a pseudo-universality test. Another possibil- 
ity is to determine some other ‘test’ machine M and 
input w and to determine whether U on code(M)w 
simulates M on w. It is tempting to choose M and w 
so that the computation of M on w is rather complex, 
on the grounds that if the corresponding computation 
of U on code(M)w is not as complex, then U cannot 
be universal. However, this means that the evalua- 
tion of universality is potentially quite complicated. 
Hence finding such an M and w will require some 
care. 

Busy beavers on universal machines 
If we have a universal machine U, then U on code(M) 
simulates M on ⊔. Hence if M is a machine of size 
≤ k, then code(M) is also of a limited size, and so we 
can evaluate the busy beaver value for M’s class of 
machines by evaluating a finite number of inputs to 
U. This suggests an alternative method for evaluating 
busy beaver machines may be to find an appropriate 
encoding and use it to find inputs for U. This is very 
much in the spirit of Bátfai (Bátfai 2009), who has 
shown how by ‘recombining’ machines and slightly 
modifying the class of machines used (such as allow- 
ing a transition to leave the tape pointer stationary, 
and not allowing for an explicit halting transition) 
it is possible to generate machines with greater pro- 
ductivities than the known busy beaver champions. 
This suggests that it may be interesting to evaluate 
busy beaver candidates by representing them as an 
appropriately size-restricted set of inputs to a uni- 
versal machine (such as Colmerauer’s machine). In 
particular, it seems natural to ask what is the small- 
est universal machine which is capable of simulating 
busy beaver machines. As a given size of machine will 
require only inputs of up to a given size (and hence 
a finite number overall), a ‘fully universal’ machine 
may not be required, but only a machine which can 
simulate machines of up to a given size on the blank 
input. Whether the minimal such machine is smaller 
than the smallest universal machine remains an open 
question. 


9 Conclusions and Further Work 
We have seen how the process of discovering univer- 
sal machines, rather than constructing them, leads to 
a number of issues that need to be addressed before 
an automated search can be carried out. We consider 
this the first step in a long process of experimentation 
to examine various classes of machines and apply var- 
ious criteria for universality. Part of this process will 
be to examine various universality contexts and to 
investigate the differences between them, such as the 
proportion of universal machines found. 

The question of the decidability of universality re- 
mains to be settled. Assuming this is answered in 
the negative, it then leads to the question of what 
criteria for pseudo-universality may be appropriate. 
We have discussed one such criterion, but there are 
presumably several others. 

Another intriguing possibility is to consider two- 
dimensional Turing machines, i.e. machines which 
use a rectangular working space, rather than a one- 
dimensional tape. In principle, such machines do not 
extend the capabilities of one-dimensional Turing ma- 
chines. However, this does not preclude the possibil- 
ity that universal machines may be smaller for such 
machines, or have some other differences that make 
them easier to discover. 